Overview

Dataset statistics

Number of variables: 16
Number of observations: 4238
Missing cells: 645
Missing cells (%): 1.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 529.9 KiB
Average record size in memory: 128.0 B
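
The derived figures above can be rechecked from the raw counts. The following is a minimal Python sketch, assuming the missing-cell percentage is taken over all cells (variables × observations) and that 1 KiB is 1024 bytes; neither convention is stated in the report, and the variable names are illustrative.

```python
# Recompute the derived dataset statistics from the raw counts in the report.
n_variables = 16
n_observations = 4238
missing_cells = 645
total_size_kib = 529.9

# Assumption: the percentage is taken over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Total cells: {total_cells}")
print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Under these assumptions the sketch reproduces the reported 1.0% and 128.0 B.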

Variable types

Categorical: 8
Numeric: 8

Alerts

cigsPerDay is highly overall correlated with currentSmoker [High correlation]
currentSmoker is highly overall correlated with cigsPerDay [High correlation]
diaBP is highly overall correlated with prevalentHyp and 1 other field [High correlation]
diabetes is highly overall correlated with glucose [High correlation]
glucose is highly overall correlated with diabetes [High correlation]
prevalentHyp is highly overall correlated with diaBP and 1 other field [High correlation]
sysBP is highly overall correlated with diaBP and 1 other field [High correlation]
BPMeds is highly imbalanced (80.7%) [Imbalance]
prevalentStroke is highly imbalanced (94.8%) [Imbalance]
diabetes is highly imbalanced (82.8%) [Imbalance]
education has 105 (2.5%) missing values [Missing]
BPMeds has 53 (1.3%) missing values [Missing]
totChol has 50 (1.2%) missing values [Missing]
glucose has 388 (9.2%) missing values [Missing]
cigsPerDay has 2144 (50.6%) zeros [Zeros]
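
The imbalance percentages in the alerts are consistent with a score of one minus the normalised Shannon entropy of the class counts. That definition is an assumption (the report does not state it), and the function name below is hypothetical. A minimal Python sketch using the non-missing class counts from the variable sections:

```python
import math


def imbalance_score(counts):
    """Return 1 minus the normalised Shannon entropy of the class counts.

    Assumed definition: 0 means perfectly balanced, 1 means a single class.
    """
    if len(counts) < 2:
        return 0.0
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1 - entropy / math.log2(len(counts))


# Non-missing class counts copied from the variable sections of this report.
class_counts = {
    "BPMeds": [4061, 124],
    "prevalentStroke": [4213, 25],
    "diabetes": [4129, 109],
    "TenYearCHD": [3594, 644],
}

for name, counts in class_counts.items():
    print(f"{name}: {100 * imbalance_score(counts):.1f}%")
```

With this definition BPMeds, prevalentStroke and diabetes come out at 80.7%, 94.8% and 82.8%, matching the alerts, while TenYearCHD scores below 50%, which fits the absence of an imbalance alert for it.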

Reproduction

Analysis started: 2024-11-09 10:47:33.351223
Analysis finished: 2024-11-09 10:47:40.220080
Duration: 6.87 seconds
Software version: ydata-profiling v4.12.0
Download configuration: config.json

Variables

male
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 207.1 KiB
0: 2419
1: 1819

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 4238
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 0
3rd row: 1
4th row: 0
5th row: 0

Common Values

ValueCountFrequency (%)
0 2419
57.1%
1 1819
42.9%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

ValueCountFrequency (%)
0 2419
57.1%
1 1819
42.9%

Most occurring characters

ValueCountFrequency (%)
0 2419
57.1%
1 1819
42.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 4238
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 2419
57.1%
1 1819
42.9%

Most occurring scripts

ValueCountFrequency (%)
Common 4238
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 2419
57.1%
1 1819
42.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4238
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 2419
57.1%
1 1819
42.9%

age
Real number (ℝ)

Distinct: 39
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 49.584946
Minimum: 32
Maximum: 70
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 33.2 KiB

Quantile statistics

Minimum: 32
5th percentile: 37
Q1: 42
Median: 49
Q3: 56
95th percentile: 64
Maximum: 70
Range: 38
Interquartile range (IQR): 14

Descriptive statistics

Standard deviation: 8.5721599
Coefficient of variation (CV): 0.17287828
Kurtosis: -0.98963585
Mean: 49.584946
Median Absolute Deviation (MAD): 7
Skewness: 0.22814578
Sum: 210141
Variance: 73.481926
Monotonicity: Not monotonic
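
Several of the statistics for age are derived from the others, so they can be cross-checked. The following Python sketch uses only values reported in this section plus the observation count from the overview; the variable names are illustrative.

```python
# Values reported for the `age` variable.
n_observations = 4238
minimum, maximum = 32, 70
q1, q3 = 42, 56
mean = 49.584946
std_dev = 8.5721599

value_range = maximum - minimum   # reported Range: 38
iqr = q3 - q1                     # reported IQR: 14
cv = std_dev / mean               # reported CV: 0.17287828
variance = std_dev ** 2           # reported Variance: 73.481926
total = mean * n_observations     # reported Sum: 210141 (age has no missing values)

print(value_range, iqr, round(cv, 8), round(variance, 4), round(total))
```

The same relations (range = max − min, IQR = Q3 − Q1, CV = standard deviation / mean, variance = standard deviation squared) apply to the other numeric variables, with the sum taken over non-missing rows only.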
[Figure: Histogram with fixed size bins (bins=39)]
Common Values

Value | Count | Frequency (%)
40 | 191 | 4.5%
46 | 182 | 4.3%
42 | 180 | 4.2%
41 | 174 | 4.1%
48 | 173 | 4.1%
39 | 169 | 4.0%
44 | 166 | 3.9%
45 | 162 | 3.8%
43 | 159 | 3.8%
52 | 149 | 3.5%
Other values (29) | 2533 | 59.8%

Minimum 10 Values

Value | Count | Frequency (%)
32 | 1 | < 0.1%
33 | 5 | 0.1%
34 | 18 | 0.4%
35 | 42 | 1.0%
36 | 84 | 2.0%
37 | 92 | 2.2%
38 | 144 | 3.4%
39 | 169 | 4.0%
40 | 191 | 4.5%
41 | 174 | 4.1%

Maximum 10 Values

Value | Count | Frequency (%)
70 | 2 | < 0.1%
69 | 7 | 0.2%
68 | 18 | 0.4%
67 | 45 | 1.1%
66 | 38 | 0.9%
65 | 57 | 1.3%
64 | 93 | 2.2%
63 | 110 | 2.6%
62 | 99 | 2.3%
61 | 110 | 2.6%

education
Categorical

Missing 

Distinct: 4
Distinct (%): 0.1%
Missing: 105
Missing (%): 2.5%
Memory size: 215.8 KiB
1.0: 1720
2.0: 1253
3.0: 687
4.0: 473

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 12399
Distinct characters: 6
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 4.0
2nd row: 2.0
3rd row: 1.0
4th row: 3.0
5th row: 3.0

Common Values

ValueCountFrequency (%)
1.0 1720
40.6%
2.0 1253
29.6%
3.0 687
 
16.2%
4.0 473
 
11.2%
(Missing) 105
 
2.5%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

ValueCountFrequency (%)
1.0 1720
41.6%
2.0 1253
30.3%
3.0 687
 
16.6%
4.0 473
 
11.4%
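
The "Common Values" table and the "Common Values (Plot)" table report different percentages for the same education counts. The figures are consistent with the first dividing by all 4238 observations (missing rows included) and the second dividing by the 4133 non-missing rows. That reading is inferred from the numbers, not stated in the report. A short Python sketch of both denominators:

```python
# Counts per education level, copied from the tables above.
counts = {"1.0": 1720, "2.0": 1253, "3.0": 687, "4.0": 473}
n_missing = 105

n_present = sum(counts.values())   # rows with a recorded education level
n_total = n_present + n_missing    # all rows in the dataset

for level, count in counts.items():
    share_of_all = 100 * count / n_total
    share_of_present = 100 * count / n_present
    print(
        f"{level}: {share_of_all:.1f}% of all rows, "
        f"{share_of_present:.1f}% of non-missing rows"
    )
```

For level 1.0 this gives 40.6% and 41.6%, the two values shown in the respective tables.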

Most occurring characters

ValueCountFrequency (%)
. 4133
33.3%
0 4133
33.3%
1 1720
13.9%
2 1253
 
10.1%
3 687
 
5.5%
4 473
 
3.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 8266
66.7%
Other Punctuation 4133
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 4133
50.0%
1 1720
20.8%
2 1253
 
15.2%
3 687
 
8.3%
4 473
 
5.7%
Other Punctuation
ValueCountFrequency (%)
. 4133
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 12399
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
. 4133
33.3%
0 4133
33.3%
1 1720
13.9%
2 1253
 
10.1%
3 687
 
5.5%
4 473
 
3.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 12399
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 4133
33.3%
0 4133
33.3%
1 1720
13.9%
2 1253
 
10.1%
3 687
 
5.5%
4 473
 
3.8%

currentSmoker
Categorical

High correlation 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 207.1 KiB
0: 2144
1: 2094

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 4238
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 1
4th row: 1
5th row: 1

Common Values

ValueCountFrequency (%)
0 2144
50.6%
1 2094
49.4%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

ValueCountFrequency (%)
0 2144
50.6%
1 2094
49.4%

Most occurring characters

ValueCountFrequency (%)
0 2144
50.6%
1 2094
49.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 4238
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 2144
50.6%
1 2094
49.4%

Most occurring scripts

ValueCountFrequency (%)
Common 4238
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 2144
50.6%
1 2094
49.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4238
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 2144
50.6%
1 2094
49.4%

cigsPerDay
Real number (ℝ)

High correlation  Zeros 

Distinct: 33
Distinct (%): 0.8%
Missing: 29
Missing (%): 0.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 9.0030886
Minimum: 0
Maximum: 70
Zeros: 2144
Zeros (%): 50.6%
Negative: 0
Negative (%): 0.0%
Memory size: 33.2 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 0
Q3: 20
95th percentile: 30
Maximum: 70
Range: 70
Interquartile range (IQR): 20

Descriptive statistics

Standard deviation: 11.920094
Coefficient of variation (CV): 1.3240005
Kurtosis: 1.0233558
Mean: 9.0030886
Median Absolute Deviation (MAD): 0
Skewness: 1.2479099
Sum: 37894
Variance: 142.08863
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=33)]
Common Values

Value | Count | Frequency (%)
0 | 2144 | 50.6%
20 | 734 | 17.3%
30 | 217 | 5.1%
15 | 210 | 5.0%
10 | 143 | 3.4%
9 | 130 | 3.1%
5 | 121 | 2.9%
3 | 100 | 2.4%
40 | 80 | 1.9%
1 | 67 | 1.6%
Other values (23) | 263 | 6.2%

Minimum 10 Values

Value | Count | Frequency (%)
0 | 2144 | 50.6%
1 | 67 | 1.6%
2 | 18 | 0.4%
3 | 100 | 2.4%
4 | 9 | 0.2%
5 | 121 | 2.9%
6 | 18 | 0.4%
7 | 12 | 0.3%
8 | 11 | 0.3%
9 | 130 | 3.1%

Maximum 10 Values

Value | Count | Frequency (%)
70 | 1 | < 0.1%
60 | 11 | 0.3%
50 | 6 | 0.1%
45 | 3 | 0.1%
43 | 56 | 1.3%
40 | 80 | 1.9%
38 | 1 | < 0.1%
35 | 22 | 0.5%
30 | 217 | 5.1%
29 | 1 | < 0.1%

BPMeds
Categorical

Imbalance  Missing 

Distinct: 2
Distinct (%): < 0.1%
Missing: 53
Missing (%): 1.3%
Memory size: 215.5 KiB
0.0: 4061
1.0: 124

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 12555
Distinct characters: 3
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0.0
2nd row: 0.0
3rd row: 0.0
4th row: 0.0
5th row: 0.0

Common Values

ValueCountFrequency (%)
0.0 4061
95.8%
1.0 124
 
2.9%
(Missing) 53
 
1.3%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

ValueCountFrequency (%)
0.0 4061
97.0%
1.0 124
 
3.0%

Most occurring characters

ValueCountFrequency (%)
0 8246
65.7%
. 4185
33.3%
1 124
 
1.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 8370
66.7%
Other Punctuation 4185
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 8246
98.5%
1 124
 
1.5%
Other Punctuation
ValueCountFrequency (%)
. 4185
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 12555
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 8246
65.7%
. 4185
33.3%
1 124
 
1.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 12555
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 8246
65.7%
. 4185
33.3%
1 124
 
1.0%

prevalentStroke
Categorical

Imbalance 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 207.1 KiB
0: 4213
1: 25

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 4238
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

ValueCountFrequency (%)
0 4213
99.4%
1 25
 
0.6%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

ValueCountFrequency (%)
0 4213
99.4%
1 25
 
0.6%

Most occurring characters

ValueCountFrequency (%)
0 4213
99.4%
1 25
 
0.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 4238
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 4213
99.4%
1 25
 
0.6%

Most occurring scripts

ValueCountFrequency (%)
Common 4238
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 4213
99.4%
1 25
 
0.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4238
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 4213
99.4%
1 25
 
0.6%

prevalentHyp
Categorical

High correlation 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 207.1 KiB
0: 2922
1: 1316

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 4238
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 1
5th row: 0

Common Values

ValueCountFrequency (%)
0 2922
68.9%
1 1316
31.1%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

ValueCountFrequency (%)
0 2922
68.9%
1 1316
31.1%

Most occurring characters

ValueCountFrequency (%)
0 2922
68.9%
1 1316
31.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 4238
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 2922
68.9%
1 1316
31.1%

Most occurring scripts

ValueCountFrequency (%)
Common 4238
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 2922
68.9%
1 1316
31.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4238
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 2922
68.9%
1 1316
31.1%

diabetes
Categorical

High correlation  Imbalance 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 207.1 KiB
0: 4129
1: 109

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 4238
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

ValueCountFrequency (%)
0 4129
97.4%
1 109
 
2.6%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

ValueCountFrequency (%)
0 4129
97.4%
1 109
 
2.6%

Most occurring characters

ValueCountFrequency (%)
0 4129
97.4%
1 109
 
2.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 4238
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 4129
97.4%
1 109
 
2.6%

Most occurring scripts

ValueCountFrequency (%)
Common 4238
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 4129
97.4%
1 109
 
2.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4238
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 4129
97.4%
1 109
 
2.6%

totChol
Real number (ℝ)

Missing 

Distinct: 248
Distinct (%): 5.9%
Missing: 50
Missing (%): 1.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 236.72159
Minimum: 107
Maximum: 696
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 33.2 KiB

Quantile statistics

Minimum: 107
5th percentile: 170
Q1: 206
Median: 234
Q3: 263
95th percentile: 312
Maximum: 696
Range: 589
Interquartile range (IQR): 57

Descriptive statistics

Standard deviation: 44.590334
Coefficient of variation (CV): 0.18836615
Kurtosis: 4.1315818
Mean: 236.72159
Median Absolute Deviation (MAD): 29
Skewness: 0.87142201
Sum: 991390
Variance: 1988.2979
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]
Common Values

Value | Count | Frequency (%)
240 | 85 | 2.0%
220 | 70 | 1.7%
260 | 62 | 1.5%
210 | 61 | 1.4%
232 | 59 | 1.4%
250 | 57 | 1.3%
200 | 56 | 1.3%
225 | 54 | 1.3%
230 | 54 | 1.3%
205 | 53 | 1.3%
Other values (238) | 3577 | 84.4%
(Missing) | 50 | 1.2%

Minimum 10 Values

Value | Count | Frequency (%)
107 | 1 | < 0.1%
113 | 1 | < 0.1%
119 | 1 | < 0.1%
124 | 1 | < 0.1%
126 | 1 | < 0.1%
129 | 1 | < 0.1%
133 | 1 | < 0.1%
135 | 2 | < 0.1%
137 | 1 | < 0.1%
140 | 2 | < 0.1%

Maximum 10 Values

Value | Count | Frequency (%)
696 | 1 | < 0.1%
600 | 1 | < 0.1%
464 | 1 | < 0.1%
453 | 1 | < 0.1%
439 | 1 | < 0.1%
432 | 1 | < 0.1%
410 | 3 | 0.1%
405 | 1 | < 0.1%
398 | 1 | < 0.1%
392 | 1 | < 0.1%

sysBP
Real number (ℝ)

High correlation 

Distinct: 234
Distinct (%): 5.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 132.35241
Minimum: 83.5
Maximum: 295
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 33.2 KiB

Quantile statistics

Minimum: 83.5
5th percentile: 104
Q1: 117
Median: 128
Q3: 144
95th percentile: 175
Maximum: 295
Range: 211.5
Interquartile range (IQR): 27

Descriptive statistics

Standard deviation: 22.038097
Coefficient of variation (CV): 0.16651074
Kurtosis: 2.1550194
Mean: 132.35241
Median Absolute Deviation (MAD): 13
Skewness: 1.1453621
Sum: 560909.5
Variance: 485.6777
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]
Common Values

Value | Count | Frequency (%)
120 | 107 | 2.5%
130 | 102 | 2.4%
110 | 96 | 2.3%
115 | 89 | 2.1%
125 | 88 | 2.1%
124 | 84 | 2.0%
122 | 80 | 1.9%
126 | 73 | 1.7%
128 | 73 | 1.7%
123 | 72 | 1.7%
Other values (224) | 3374 | 79.6%

Minimum 10 Values

Value | Count | Frequency (%)
83.5 | 2 | < 0.1%
85 | 1 | < 0.1%
85.5 | 1 | < 0.1%
90 | 2 | < 0.1%
92 | 1 | < 0.1%
92.5 | 2 | < 0.1%
93 | 2 | < 0.1%
93.5 | 2 | < 0.1%
94 | 3 | 0.1%
95 | 7 | 0.2%

Maximum 10 Values

Value | Count | Frequency (%)
295 | 1 | < 0.1%
248 | 1 | < 0.1%
244 | 1 | < 0.1%
243 | 1 | < 0.1%
235 | 1 | < 0.1%
232 | 1 | < 0.1%
230 | 1 | < 0.1%
220 | 2 | < 0.1%
217 | 1 | < 0.1%
215 | 3 | 0.1%

diaBP
Real number (ℝ)

High correlation 

Distinct: 146
Distinct (%): 3.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 82.893464
Minimum: 48
Maximum: 142.5
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 33.2 KiB

Quantile statistics

Minimum: 48
5th percentile: 66
Q1: 75
Median: 82
Q3: 89.875
95th percentile: 104.575
Maximum: 142.5
Range: 94.5
Interquartile range (IQR): 14.875

Descriptive statistics

Standard deviation: 11.91085
Coefficient of variation (CV): 0.14368865
Kurtosis: 1.2770996
Mean: 82.893464
Median Absolute Deviation (MAD): 7.5
Skewness: 0.71410218
Sum: 351302.5
Variance: 141.86834
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]
Common Values

Value | Count | Frequency (%)
80 | 262 | 6.2%
82 | 152 | 3.6%
85 | 137 | 3.2%
70 | 135 | 3.2%
81 | 131 | 3.1%
84 | 122 | 2.9%
90 | 119 | 2.8%
78 | 116 | 2.7%
87 | 113 | 2.7%
75 | 108 | 2.5%
Other values (136) | 2843 | 67.1%

Minimum 10 Values

Value | Count | Frequency (%)
48 | 1 | < 0.1%
50 | 1 | < 0.1%
51 | 1 | < 0.1%
52 | 2 | < 0.1%
53 | 1 | < 0.1%
54 | 1 | < 0.1%
55 | 3 | 0.1%
56 | 2 | < 0.1%
57 | 6 | 0.1%
57.5 | 3 | 0.1%

Maximum 10 Values

Value | Count | Frequency (%)
142.5 | 1 | < 0.1%
140 | 1 | < 0.1%
136 | 2 | < 0.1%
135 | 2 | < 0.1%
133 | 2 | < 0.1%
132 | 1 | < 0.1%
130 | 5 | 0.1%
129 | 1 | < 0.1%
128 | 1 | < 0.1%
127.5 | 1 | < 0.1%

BMI
Real number (ℝ)

Distinct: 1363
Distinct (%): 32.3%
Missing: 19
Missing (%): 0.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 25.802008
Minimum: 15.54
Maximum: 56.8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 33.2 KiB

Quantile statistics

Minimum: 15.54
5th percentile: 20.06
Q1: 23.07
Median: 25.4
Q3: 28.04
95th percentile: 32.782
Maximum: 56.8
Range: 41.26
Interquartile range (IQR): 4.97

Descriptive statistics

Standard deviation: 4.0801111
Coefficient of variation (CV): 0.15813153
Kurtosis: 2.6568387
Mean: 25.802008
Median Absolute Deviation (MAD): 2.49
Skewness: 0.98197431
Sum: 108858.67
Variance: 16.647306
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]
Common Values

Value | Count | Frequency (%)
22.91 | 18 | 0.4%
22.54 | 18 | 0.4%
23.48 | 18 | 0.4%
22.19 | 18 | 0.4%
23.09 | 16 | 0.4%
25.09 | 16 | 0.4%
23.1 | 13 | 0.3%
22.73 | 13 | 0.3%
25.23 | 13 | 0.3%
27.78 | 12 | 0.3%
Other values (1353) | 4064 | 95.9%
(Missing) | 19 | 0.4%

Minimum 10 Values

Value | Count | Frequency (%)
15.54 | 1 | < 0.1%
15.96 | 1 | < 0.1%
16.48 | 1 | < 0.1%
16.59 | 2 | < 0.1%
16.61 | 1 | < 0.1%
16.69 | 1 | < 0.1%
16.71 | 1 | < 0.1%
16.73 | 1 | < 0.1%
16.75 | 1 | < 0.1%
16.87 | 1 | < 0.1%

Maximum 10 Values

Value | Count | Frequency (%)
56.8 | 1 | < 0.1%
51.28 | 1 | < 0.1%
45.8 | 1 | < 0.1%
45.79 | 1 | < 0.1%
44.71 | 1 | < 0.1%
44.55 | 1 | < 0.1%
44.27 | 1 | < 0.1%
44.09 | 1 | < 0.1%
43.69 | 1 | < 0.1%
43.67 | 1 | < 0.1%

heartRate
Real number (ℝ)

Distinct: 73
Distinct (%): 1.7%
Missing: 1
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 75.878924
Minimum: 44
Maximum: 143
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 33.2 KiB

Quantile statistics

Minimum: 44
5th percentile: 60
Q1: 68
Median: 75
Q3: 83
95th percentile: 98
Maximum: 143
Range: 99
Interquartile range (IQR): 15

Descriptive statistics

Standard deviation: 12.026596
Coefficient of variation (CV): 0.15849719
Kurtosis: 0.90748324
Mean: 75.878924
Median Absolute Deviation (MAD): 7
Skewness: 0.64448173
Sum: 321499
Variance: 144.63902
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]
Common Values

Value | Count | Frequency (%)
75 | 563 | 13.3%
80 | 385 | 9.1%
70 | 305 | 7.2%
60 | 231 | 5.5%
85 | 227 | 5.4%
72 | 222 | 5.2%
65 | 197 | 4.6%
90 | 172 | 4.1%
68 | 151 | 3.6%
100 | 98 | 2.3%
Other values (63) | 1686 | 39.8%

Minimum 10 Values

Value | Count | Frequency (%)
44 | 1 | < 0.1%
45 | 2 | < 0.1%
46 | 1 | < 0.1%
47 | 1 | < 0.1%
48 | 5 | 0.1%
50 | 22 | 0.5%
51 | 1 | < 0.1%
52 | 17 | 0.4%
53 | 11 | 0.3%
54 | 12 | 0.3%

Maximum 10 Values

Value | Count | Frequency (%)
143 | 1 | < 0.1%
140 | 1 | < 0.1%
130 | 1 | < 0.1%
125 | 3 | 0.1%
122 | 2 | < 0.1%
120 | 7 | 0.2%
115 | 5 | 0.1%
112 | 3 | 0.1%
110 | 36 | 0.8%
108 | 8 | 0.2%

glucose
Real number (ℝ)

High correlation  Missing 

Distinct: 143
Distinct (%): 3.7%
Missing: 388
Missing (%): 9.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 81.966753
Minimum: 40
Maximum: 394
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 33.2 KiB

Quantile statistics

Minimum: 40
5th percentile: 62
Q1: 71
Median: 78
Q3: 87
95th percentile: 108.55
Maximum: 394
Range: 354
Interquartile range (IQR): 16

Descriptive statistics

Standard deviation: 23.959998
Coefficient of variation (CV): 0.29231362
Kurtosis: 58.674278
Mean: 81.966753
Median Absolute Deviation (MAD): 8
Skewness: 6.2134019
Sum: 315572
Variance: 574.08151
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]
Common Values

Value | Count | Frequency (%)
75 | 193 | 4.6%
77 | 167 | 3.9%
73 | 156 | 3.7%
80 | 152 | 3.6%
70 | 152 | 3.6%
83 | 151 | 3.6%
78 | 148 | 3.5%
74 | 141 | 3.3%
85 | 127 | 3.0%
76 | 127 | 3.0%
Other values (133) | 2336 | 55.1%
(Missing) | 388 | 9.2%

Minimum 10 Values

Value | Count | Frequency (%)
40 | 2 | < 0.1%
43 | 1 | < 0.1%
44 | 2 | < 0.1%
45 | 4 | 0.1%
47 | 3 | 0.1%
48 | 1 | < 0.1%
50 | 3 | 0.1%
52 | 2 | < 0.1%
53 | 5 | 0.1%
54 | 5 | 0.1%

Maximum 10 Values

Value | Count | Frequency (%)
394 | 2 | < 0.1%
386 | 1 | < 0.1%
370 | 1 | < 0.1%
368 | 1 | < 0.1%
348 | 1 | < 0.1%
332 | 1 | < 0.1%
325 | 1 | < 0.1%
320 | 1 | < 0.1%
297 | 1 | < 0.1%
294 | 1 | < 0.1%

TenYearCHD
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 207.1 KiB
0: 3594
1: 644

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 4238
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 1
5th row: 0

Common Values

ValueCountFrequency (%)
0 3594
84.8%
1 644
 
15.2%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

ValueCountFrequency (%)
0 3594
84.8%
1 644
 
15.2%

Most occurring characters

ValueCountFrequency (%)
0 3594
84.8%
1 644
 
15.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 4238
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 3594
84.8%
1 644
 
15.2%

Most occurring scripts

ValueCountFrequency (%)
Common 4238
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 3594
84.8%
1 644
 
15.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4238
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 3594
84.8%
1 644
 
15.2%

Interactions

[Figures: 64 pairwise interaction plots]

Correlations

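Variable | BMI | BPMeds | TenYearCHD | age | cigsPerDay | currentSmoker | diaBP | diabetes | education | glucose | heartRate | male | prevalentHyp | prevalentStroke | sysBP | totChol
BMI | 1.000 | 0.136 | 0.087 | 0.145 | -0.140 | 0.156 | 0.375 | 0.103 | 0.084 | 0.071 | 0.056 | 0.208 | 0.293 | 0.209 | 0.324 | 0.148
BPMeds | 0.136 | 1.000 | 0.084 | 0.132 | 0.023 | 0.045 | 0.216 | 0.045 | 0.000 | 0.085 | 0.069 | 0.049 | 0.259 | 0.107 | 0.283 | 0.069
TenYearCHD | 0.087 | 0.084 | 1.000 | 0.220 | 0.058 | 0.011 | 0.156 | 0.094 | 0.084 | 0.123 | 0.000 | 0.086 | 0.176 | 0.055 | 0.210 | 0.077
age | 0.145 | 0.132 | 0.220 | 1.000 | -0.215 | 0.226 | 0.209 | 0.097 | 0.145 | 0.116 | -0.015 | 0.009 | 0.301 | 0.064 | 0.391 | 0.289
cigsPerDay | -0.140 | 0.023 | 0.058 | -0.215 | 1.000 | 0.846 | -0.089 | 0.000 | 0.042 | -0.090 | 0.078 | 0.326 | 0.104 | 0.000 | -0.111 | -0.041
currentSmoker | 0.156 | 0.045 | 0.011 | 0.226 | 0.846 | 1.000 | 0.110 | 0.040 | 0.061 | 0.082 | 0.073 | 0.197 | 0.102 | 0.026 | 0.123 | 0.039
diaBP | 0.375 | 0.216 | 0.156 | 0.209 | -0.089 | 0.110 | 1.000 | 0.050 | 0.047 | 0.047 | 0.179 | 0.069 | 0.637 | 0.040 | 0.778 | 0.186
diabetes | 0.103 | 0.045 | 0.094 | 0.097 | 0.000 | 0.040 | 0.050 | 1.000 | 0.042 | 0.718 | 0.046 | 0.000 | 0.075 | 0.000 | 0.128 | 0.090
education | 0.084 | 0.000 | 0.084 | 0.145 | 0.042 | 0.061 | 0.047 | 0.042 | 1.000 | 0.032 | 0.043 | 0.143 | 0.089 | 0.024 | 0.072 | 0.022
glucose | 0.071 | 0.085 | 0.123 | 0.116 | -0.090 | 0.082 | 0.047 | 0.718 | 0.032 | 1.000 | 0.097 | 0.000 | 0.087 | 0.031 | 0.117 | 0.030
heartRate | 0.056 | 0.069 | 0.000 | -0.015 | 0.078 | 0.073 | 0.179 | 0.046 | 0.043 | 0.097 | 1.000 | 0.114 | 0.140 | 0.000 | 0.171 | 0.089
male | 0.208 | 0.049 | 0.086 | 0.009 | 0.326 | 0.197 | 0.069 | 0.000 | 0.143 | 0.000 | 0.114 | 1.000 | 0.000 | 0.000 | 0.105 | 0.083
prevalentHyp | 0.293 | 0.259 | 0.176 | 0.301 | 0.104 | 0.102 | 0.637 | 0.075 | 0.089 | 0.087 | 0.140 | 0.000 | 1.000 | 0.070 | 0.709 | 0.160
prevalentStroke | 0.209 | 0.107 | 0.055 | 0.064 | 0.000 | 0.026 | 0.040 | 0.000 | 0.024 | 0.031 | 0.000 | 0.000 | 0.070 | 1.000 | 0.057 | 0.000
sysBP | 0.324 | 0.283 | 0.210 | 0.391 | -0.111 | 0.123 | 0.778 | 0.128 | 0.072 | 0.117 | 0.171 | 0.105 | 0.709 | 0.057 | 1.000 | 0.224
totChol | 0.148 | 0.069 | 0.077 | 0.289 | -0.041 | 0.039 | 0.186 | 0.090 | 0.022 | 0.030 | 0.089 | 0.083 | 0.160 | 0.000 | 0.224 | 1.000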
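
The "High correlation" alerts can be related back to this matrix by thresholding the absolute coefficients. The Python sketch below does that for a handful of entries copied from the matrix. The 0.5 cut-off is an assumption chosen for illustration; the report does not state which threshold was used.

```python
# A few off-diagonal entries copied from the correlation matrix above.
correlations = {
    ("cigsPerDay", "currentSmoker"): 0.846,
    ("diaBP", "sysBP"): 0.778,
    ("diabetes", "glucose"): 0.718,
    ("prevalentHyp", "sysBP"): 0.709,
    ("diaBP", "prevalentHyp"): 0.637,
    ("age", "sysBP"): 0.391,
    ("BMI", "diaBP"): 0.375,
    ("cigsPerDay", "male"): 0.326,
}

THRESHOLD = 0.5  # assumed cut-off, not stated in the report

# Keep the pairs at or above the threshold, strongest first.
high_pairs = sorted(
    (pair for pair, r in correlations.items() if abs(r) >= THRESHOLD),
    key=lambda pair: -abs(correlations[pair]),
)

for left, right in high_pairs:
    print(f"{left} ~ {right}: {correlations[(left, right)]:.3f}")
```

With this cut-off the five retained pairs involve exactly the variables named in the alerts: cigsPerDay and currentSmoker, diabetes and glucose, and the diaBP, sysBP and prevalentHyp triple.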

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
[Figure] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
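
Nullity correlation can be sketched as the Pearson correlation between two missingness indicators (1 where a value is missing, 0 otherwise). The Python example below applies this to glucose and BPMeds over the last ten rows of the sample (rows 4228 to 4237). Ten rows are far too few to say anything about the full dataset; the example only illustrates the calculation, and the helper name is hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Missingness indicators (1 = NaN) for rows 4228-4237 of the sample.
glucose_missing = [0, 1, 1, 0, 0, 0, 0, 0, 1, 0]
bpmeds_missing = [0, 0, 0, 0, 0, 0, 0, 1, 0, 0]

nullity_corr = pearson(glucose_missing, bpmeds_missing)
print(f"{nullity_corr:.3f}")
```

A value near 1 would mean the two columns tend to be missing together, a value near -1 that one tends to be present when the other is missing, and a value near 0 that their missingness is unrelated.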

Sample

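First 10 rows

Index | male | age | education | currentSmoker | cigsPerDay | BPMeds | prevalentStroke | prevalentHyp | diabetes | totChol | sysBP | diaBP | BMI | heartRate | glucose | TenYearCHD
0 | 1 | 39 | 4.0 | 0 | 0.0 | 0.0 | 0 | 0 | 0 | 195.0 | 106.0 | 70.0 | 26.97 | 80.0 | 77.0 | 0
1 | 0 | 46 | 2.0 | 0 | 0.0 | 0.0 | 0 | 0 | 0 | 250.0 | 121.0 | 81.0 | 28.73 | 95.0 | 76.0 | 0
2 | 1 | 48 | 1.0 | 1 | 20.0 | 0.0 | 0 | 0 | 0 | 245.0 | 127.5 | 80.0 | 25.34 | 75.0 | 70.0 | 0
3 | 0 | 61 | 3.0 | 1 | 30.0 | 0.0 | 0 | 1 | 0 | 225.0 | 150.0 | 95.0 | 28.58 | 65.0 | 103.0 | 1
4 | 0 | 46 | 3.0 | 1 | 23.0 | 0.0 | 0 | 0 | 0 | 285.0 | 130.0 | 84.0 | 23.10 | 85.0 | 85.0 | 0
5 | 0 | 43 | 2.0 | 0 | 0.0 | 0.0 | 0 | 1 | 0 | 228.0 | 180.0 | 110.0 | 30.30 | 77.0 | 99.0 | 0
6 | 0 | 63 | 1.0 | 0 | 0.0 | 0.0 | 0 | 0 | 0 | 205.0 | 138.0 | 71.0 | 33.11 | 60.0 | 85.0 | 1
7 | 0 | 45 | 2.0 | 1 | 20.0 | 0.0 | 0 | 0 | 0 | 313.0 | 100.0 | 71.0 | 21.68 | 79.0 | 78.0 | 0
8 | 1 | 52 | 1.0 | 0 | 0.0 | 0.0 | 0 | 1 | 0 | 260.0 | 141.5 | 89.0 | 26.36 | 76.0 | 79.0 | 0
9 | 1 | 43 | 1.0 | 1 | 30.0 | 0.0 | 0 | 1 | 0 | 225.0 | 162.0 | 107.0 | 23.61 | 93.0 | 88.0 | 0

Last 10 rows

Index | male | age | education | currentSmoker | cigsPerDay | BPMeds | prevalentStroke | prevalentHyp | diabetes | totChol | sysBP | diaBP | BMI | heartRate | glucose | TenYearCHD
4228 | 0 | 50 | 1.0 | 0 | 0.0 | 0.0 | 0 | 1 | 1 | 260.0 | 190.0 | 130.0 | 43.67 | 85.0 | 260.0 | 0
4229 | 0 | 51 | 3.0 | 1 | 20.0 | 0.0 | 0 | 1 | 0 | 251.0 | 140.0 | 80.0 | 25.60 | 75.0 | NaN | 0
4230 | 0 | 56 | 1.0 | 1 | 3.0 | 0.0 | 0 | 1 | 0 | 268.0 | 170.0 | 102.0 | 22.89 | 57.0 | NaN | 0
4231 | 1 | 58 | 3.0 | 0 | 0.0 | 0.0 | 0 | 1 | 0 | 187.0 | 141.0 | 81.0 | 24.96 | 80.0 | 81.0 | 0
4232 | 1 | 68 | 1.0 | 0 | 0.0 | 0.0 | 0 | 1 | 0 | 176.0 | 168.0 | 97.0 | 23.14 | 60.0 | 79.0 | 1
4233 | 1 | 50 | 1.0 | 1 | 1.0 | 0.0 | 0 | 1 | 0 | 313.0 | 179.0 | 92.0 | 25.97 | 66.0 | 86.0 | 1
4234 | 1 | 51 | 3.0 | 1 | 43.0 | 0.0 | 0 | 0 | 0 | 207.0 | 126.5 | 80.0 | 19.71 | 65.0 | 68.0 | 0
4235 | 0 | 48 | 2.0 | 1 | 20.0 | NaN | 0 | 0 | 0 | 248.0 | 131.0 | 72.0 | 22.00 | 84.0 | 86.0 | 0
4236 | 0 | 44 | 1.0 | 1 | 15.0 | 0.0 | 0 | 0 | 0 | 210.0 | 126.5 | 87.0 | 19.16 | 86.0 | NaN | 0
4237 | 0 | 52 | 2.0 | 0 | 0.0 | 0.0 | 0 | 0 | 0 | 269.0 | 133.5 | 83.0 | 21.47 | 80.0 | 107.0 | 0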